AITopics | epoch number

Collaborating Authors

epoch number

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix A Derivation of (3) Based on the fact that the θ (m) is satisfied with the stationary condition of the lower-level objective function in (2), we obtain

Neural Information Processing SystemsAug-15-2025, 23:06:30 GMT

Masks may vary between each iteration, and the pruned weights are indicated using the light gray color. Different colors of the edges in the neural networks refer to the weight update. The initial learning rate for all the methods are 0.1. All the evaluations are based on a single Tesla-V100 GPU. P do not require additional epochs for retraining.

artificial intelligence, machine learning, pruning ratio, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

57b694fef23ae7b9308eb4d46342595d-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-14-2025, 23:32:43 GMT

epoch number, second wallclock time, wallclock time, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training

Zhang, Jiacheng, Liu, Feng, Zhou, Dawei, Zhang, Jingfeng, Liu, Tongliang

arXiv.org Artificial IntelligenceJun-2-2024

Adversarial training (AT) trains models using adversarial examples (AEs), which are natural images modified with specific perturbations to mislead the model. These perturbations are constrained by a predefined perturbation budget $\epsilon$ and are equally applied to each pixel within an image. However, in this paper, we discover that not all pixels contribute equally to the accuracy on AEs (i.e., robustness) and accuracy on natural images (i.e., accuracy). Motivated by this finding, we propose Pixel-reweighted AdveRsarial Training (PART), a new framework that partially reduces $\epsilon$ for less influential pixels, guiding the model to focus more on key regions that affect its outputs. Specifically, we first use class activation mapping (CAM) methods to identify important pixel regions, then we keep the perturbation budget for these regions while lowering it for the remaining regions when generating AEs. In the end, we use these pixel-reweighted AEs to train a model. PART achieves a notable improvement in accuracy without compromising robustness on CIFAR-10, SVHN and TinyImagenet-200, justifying the necessity to allocate distinct weights to different pixel regions in robust classification.

accuracy-robustness trade-off, perturbation, pixel region, (11 more...)

arXiv.org Artificial Intelligence

2406.00685

Country:

Europe > Austria > Vienna (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction

Lei, Bowen, Xu, Dongkuan, Zhang, Ruqi, He, Shuren, Mallick, Bani K.

arXiv.org Artificial IntelligenceDec-5-2023

Despite impressive performance, deep neural networks require significant memory and computation costs, prohibiting their application in resource-constrained scenarios. Sparse training is one of the most common techniques to reduce these costs, however, the sparsity constraints add difficulty to the optimization, resulting in an increase in training time and instability. In this work, we aim to overcome this problem and achieve space-time co-efficiency. To accelerate and stabilize the convergence of sparse training, we analyze the gradient changes and develop an adaptive gradient correction method. Specifically, we approximate the correlation between the current and previous gradients, which is used to balance the two gradients to obtain a corrected gradient. Our method can be used with the most popular sparse training pipelines under both standard and adversarial setups. Theoretically, we prove that our method can accelerate the convergence rate of sparse training. Extensive experiments on multiple datasets, model architectures, and sparsities demonstrate that our method outperforms leading sparse training methods by up to \textbf{5.0\%} in accuracy given the same number of training epochs, and reduces the number of training epochs by up to \textbf{52.1\%} to achieve the same accuracy. Our code is available on: \url{https://github.com/StevenBoys/AGENT}.

accuracy, gradient, sparse training, (15 more...)

arXiv.org Artificial Intelligence

2301.03573

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > North Carolina (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks

Qin, Zeyu, Yao, Liuyi, Chen, Daoyuan, Li, Yaliang, Ding, Bolin, Cheng, Minhao

arXiv.org Artificial IntelligenceJun-5-2023

In this work, besides improving prediction accuracy, we study whether personalization could bring robustness benefits to backdoor attacks. We conduct the first study of backdoor attacks in the pFL framework, testing 4 widely used backdoor attacks against 6 pFL methods on benchmark datasets FEMNIST and CIFAR-10, a total of 600 experiments. The study shows that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks. In contrast, pFL methods with full model-sharing do not show robustness. To analyze the reasons for varying robustness performances, we provide comprehensive ablation studies on different pFL methods. Based on our findings, we further propose a lightweight defense method, Simple-Tuning, which empirically improves defense performance against backdoor attacks. We believe that our work could provide both guidance for pFL application in terms of its robustness and offer valuable insights to design more robust FL methods in the future. We open-source our code to establish the first benchmark for black-box backdoor attacks in pFL: https://github.com/alibaba/FederatedScope/tree/backdoor-bench.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.01677

Country:

North America > United States > California > Los Angeles County > Long Beach (0.05)
North America > United States > Virginia (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Supervising the Multi-Fidelity Race of Hyperparameter Configurations

Wistuba, Martin, Kadra, Arlind, Grabocka, Josif

arXiv.org Artificial IntelligenceJun-1-2023

Multi-fidelity (gray-box) hyperparameter optimization techniques (HPO) have recently emerged as a promising direction for tuning Deep Learning methods. However, existing methods suffer from a sub-optimal allocation of the HPO budget to the hyperparameter configurations. In this work, we introduce DyHPO, a Bayesian Optimization method that learns to decide which hyperparameter configuration to train further in a dynamic race among all feasible configurations. We propose a new deep kernel for Gaussian Processes that embeds the learning curve dynamics, and an acquisition function that incorporates multi-budget information. We demonstrate the significant superiority of DyHPO against state-of-the-art hyperparameter optimization methods through large-scale experiments comprising 50 datasets (Tabular, Image, NLP) and diverse architectures (MLP, CNN/NAS, RNN).

artificial intelligence, configuration, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2202.09774

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > Canada > Quebec > Montreal (0.04)
(21 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A framework for the emergence and analysis of language in social learning agents

Wieczorek, Tobias J., Tchumatchenko, Tatjana, Carvajal, Carlos Wert, Eggl, Maximilian F.

arXiv.org Artificial IntelligenceMay-4-2023

Artificial neural networks (ANNs) are increasingly used as research models, but questions remain about their generalizability and representational invariance. Biological neural networks under social constraints evolved to enable communicable representations, demonstrating generalization capabilities. This study proposes a communication protocol between cooperative agents to analyze the formation of individual and shared abstractions and their impact on task performance. This communication protocol aims to mimic language features by encoding high-dimensional information through low-dimensional representation. Using grid-world mazes and reinforcement learning, teacher ANNs pass a compressed message to a student ANN for better task completion. Through this, the student achieves a higher goal-finding rate and generalizes the goal location across task worlds. Further optimizing message content to maximize student reward improves information encoding, suggesting that an accurate representation in the space of messages requires bi-directional input. This highlights the role of language as a common representation between agents and its implications on generalization capabilities.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2305.02632

Country: